Abstract. In this paper, we generalise the famous algorithm for swapping the contents of two variables without using a 
buffer. We introduce a novel combinatorial framework for procedural programming languages, where programs are only allowed 
to update one variable at a time. We first consider programs which do not have any memory. We prove that any function 
of all the variables can be computed this way in a number of updates which grows linearly with the number of variables. 
Similarly, any linear function can be computed using a linear number of linear instructions. We then derive the exact number 
of instructions required to compute any manipulation of variables. This shows that the idea of combining variables instead of 
simply moving them around not only allows for memoryless programs, but also yields shorter programs. Second, we show that 
allowing programs to use memory is also incorporated in our framework. We quantify the gains obtained by using memory. 
This leads to shorter programs and allows us to use only binary instructions, which is not sufficient in general when no memory 
is used. 
1. Introduction. How do you swap the contents of variables $x$ and $y$ using a procedural programming 
language? The common approach is to use a buffer $t$, and to do as follows (using pseudo-code). 

$t \leftarrow x$
$x \leftarrow y$
$y \leftarrow t$

However, a famous programmer's trick consists in using XOR, which we view as addition over a binary 
vector space: 

$x \leftarrow x + y$
$y \leftarrow x + y$
$x \leftarrow x + y$.

We thus perform the swap without any use of memory. Our aim is to generalise this idea to compute 
transformations without memory. 
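As a concrete illustration of the trick above, here is a minimal Python sketch. The function name `xor_swap` is ours and purely illustrative; bitwise XOR on integers plays the role of addition over a binary vector space.

```python
def xor_swap(x: int, y: int) -> tuple[int, int]:
    """Swap two non-negative integers without a buffer, using three XOR updates."""
    x = x ^ y  # x now holds x0 + y0 (addition over GF(2)^k is bitwise XOR)
    y = x ^ y  # y now holds (x0 + y0) + y0 = x0
    x = x ^ y  # x now holds (x0 + y0) + x0 = y0
    return x, y


print(xor_swap(5, 9))
```

Each line updates exactly one variable, which is the restriction formalised in Section 2.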

While the example described above is folklore, the idea to compute functions without memory was 
developed in [1], [3], [4], [5], [6], [7] for the case of boolean variables. We would like to emphasize the novelty of the 
results of this paper and how they differ from those in the literature. First, the results presented in this paper 
generalise those given in the literature, as we consider any finite alphabet while only the binary alphabet 
was usually considered in the literature. Second, we provide simpler proofs, which is especially true for 
Theorems 2.4 and 3.13. Third, we also give some matching upper and lower bounds which are absent in the 
literature, e.g. in Theorem 3.5. Fourth, many aspects considered here, such as the study of manipulations 
of variables in Section 4.3, the use of binary instructions in Theorem 5.8, and the use of memory in Section 5, 
are completely novel. 

2. Combinatorial model for memoryless computations. 

2.1. Instructions and programs. We formalise our ideas as follows. Let $A$ be a finite set, referred 
to as the alphabet, of cardinality $q$ and let $n$ be a positive integer (without loss, we shall usually regard $A$ 
as $\mathbb{Z}_q$ or $\mathrm{GF}(q)$ when $q$ is a prime power). The cases where $q = 1$ or $n = 1$ being trivial, we shall assume 
$q \ge 2$ and $n \ge 2$ henceforth. We refer to any element of $A^n$ as a word. We view any transformation $f$ of 
$A^n$ (i.e., $f : A^n \to A^n$) as a tuple of functions $f = (f_1, \ldots, f_n)$, where $f_i : A^n \to A$ is referred to as the 
$i$-th coordinate function of $f$. In particular, a coordinate function is trivial if it is equal to the identity, i.e. 
$f_i(x) = x_i$; it is nontrivial otherwise. The size of the image of $f$ is referred to as its rank. When considering 
a sequence of transformations, we shall use superscripts, e.g. $f^k : A^n \to A^n$ for all $k$, and hence $f^k$ shall 
never mean taking $f$ to the power $k$. 

Definition 2.1 (Instruction). An instruction is a transformation $g$ of $A^n$ with at most one nontrivial 
coordinate function $g_i$. We say that the instruction updates $y_i$ for $y = (y_1, \ldots, y_n) \in A^n$ and we denote it as 

$y_i \leftarrow g_i(y)$.

A permutation instruction is an instruction which maps $A^n$ bijectively onto $A^n$ (i.e. is a permutation of $A^n$). 

By convention, the identity is an instruction, which can be represented by $y_i \leftarrow y_i$ for any $1 \le i \le n$. 

We denote the set of instructions of $A^n$ as $\mathcal{T}(A^n)$ and the set of permutation instructions as $\mathcal{I}(A^n)$. We 
shall simply write $\mathcal{T}$ and $\mathcal{I}$ when there is no ambiguity. For instance, if $A = \mathrm{GF}(2)$ and $n = 2$, then $\mathcal{I}$ is 
given by 

$\{(x_1, x_2),\ (x_1 + 1, x_2),\ (x_1 + x_2, x_2),\ (x_1 + x_2 + 1, x_2),\ (x_1, x_2 + 1),\ (x_1, x_1 + x_2),\ (x_1, x_1 + x_2 + 1)\}$. 

In update form, $\mathcal{I}$ can be written as follows: 

$\{y_1 \leftarrow y_1,\ y_1 \leftarrow y_1 + 1,\ y_1 \leftarrow y_1 + y_2,\ y_1 \leftarrow y_1 + y_2 + 1,$ 

$\ y_2 \leftarrow y_2,\ y_2 \leftarrow y_2 + 1,\ y_2 \leftarrow y_1 + y_2,\ y_2 \leftarrow y_1 + y_2 + 1\}$, 

where the identity is represented by $y_1 \leftarrow y_1$ and $y_2 \leftarrow y_2$. 

Definition 2.2 (Program). For any transformation $f$ of $A^n$, a program of length $L$ computing $f$ is a 
sequence of instructions $g^1, \ldots, g^L$ such that 

$f = g^L \circ \cdots \circ g^1$. 

We shall write the instructions of a program in their update form one below the other. Although the 
identity is an instruction, any instruction in a program is not the identity unless specified otherwise. Also, 
since the set of instructions updating a given coordinate is closed under composition, without loss we can 
always assume that $g^{k+1}$ updates a different coordinate than $g^k$ for all $k$. 

We consider a basic procedural programming language which has a finite number of inputs $x = (x_1, \ldots, x_n) \in A^n$ 
and only allows programs of the form described above. Therefore, it only allows in-place calculations, 
without loops, pointers, and more importantly without any memory. We use $y = (y_1, \ldots, y_n)$ to represent 
the content of the registers during the program. Hence $y = x$ before the first instruction, and $y = f(x)$ after 
the last instruction. Note that we will also use the shortcut notation $y_i \leftarrow h(x)$ to reflect how the content of 
the memory relates to the program input. In particular, note that the last update of $y_i$ must be 

$y_i \leftarrow f_i(x)$. 

To be absolutely rigorous, we should let $y$ take into account the instruction number: $y^0 = x, y^1, \ldots, y^L = f(x)$, 
where $L$ is the length of the program. However, our calculations will not require such level of rigour, 
and we simply use $y$ instead. 

In order to illustrate our notations, let us rewrite the program computing the swap of two variables, i.e. 
$f : A^2 \to A^2$ where $f(x_1, x_2) = (x_2, x_1)$. It is given as follows: 

$y_1 \leftarrow y_1 + y_2 \quad (= x_1 + x_2)$ 
$y_2 \leftarrow y_1 - y_2 \quad (= x_1)$ 
$y_1 \leftarrow y_1 - y_2 \quad (= x_2)$. 
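The notion of a program can be made concrete with a small interpreter. The sketch below assumes $A = \mathbb{Z}_q$ and represents an instruction as a pair (coordinate index, update function); the names `run_program`, `SWAP` and `computes_swap` are ours, and coordinates are 0-indexed.

```python
from itertools import product


def run_program(program, x, q):
    """Apply a list of instructions (i, g_i) to the input word x over Z_q."""
    y = list(x)
    for i, g in program:
        y[i] = g(y) % q  # only coordinate i is modified
    return tuple(y)


# The three-instruction swap program from the text (0-indexed coordinates).
SWAP = [
    (0, lambda y: y[0] + y[1]),  # y_1 <- y_1 + y_2
    (1, lambda y: y[0] - y[1]),  # y_2 <- y_1 - y_2
    (0, lambda y: y[0] - y[1]),  # y_1 <- y_1 - y_2
]


def computes_swap(q):
    """Check the program against f(x_1, x_2) = (x_2, x_1) on every input."""
    return all(run_program(SWAP, x, q) == (x[1], x[0])
               for x in product(range(q), repeat=2))


print(computes_swap(2), computes_swap(5))
```

The exhaustive check over all $q^2$ inputs is only feasible for small alphabets; it is meant as a sanity check of the notation, not as a proof.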

Definition 2.3. Let $B$, $C$ be two alphabets and $f, g : B \to C$. We say $g$ dominates $f$ if and only if 

$g(x) = g(x') \Rightarrow f(x) = f(x')$ 

for all $x, x' \in B$. In other words, $f = h \circ g$ for some transformation $h$. 

A program for $f$ induces a sequence of transformations $h^1, \ldots, h^L = f$ where $h^1$ is an instruction, $h^i$ 
and $h^{i+1}$ differ in only one coordinate, and $h^i$ dominates $h^{i+1}$ for all $i$. Indeed, simply let $h^{i+1} = g^{i+1} \circ h^i$; 
equivalently $h^i$ represents the content of $y$ after the $i$-th instruction of the program. In particular, if $f$ is a 
permutation, then all intermediate transformations must be permutations as well. 

We remark that our programming language only allows returning one output: the transformation $f$ 
computed by the program. However, it may be fair to ask the program to sequentially return outputs. This 
can be incorporated in our framework if all the outputs are permutations. However, the case of general 
transformations is more troublesome: for instance, if we ask to return $f^1(x_1, x_2) = (x_1, x_1 + 1)$ and then 
$f^2(x_1, x_2) = (x_2, x_2 + 1)$, then it is clear that $f^2$ cannot be computed after $f^1$. In general, a program can 
sequentially compute $f^1, \ldots, f^K$ only if $f^i$ dominates $f^{i+1}$ for all $1 \le i \le K - 1$ (our results will show that 
this is necessary and sufficient). Therefore, this program can be broken down into $K$ shorter programs, each 
computing one output. In view of these considerations, we shall only consider programs which compute one 
output transformation $f$ in the remainder of this paper. 

2.2. All transformations are computable without memory. We are now interested in the general 
case of computing any transformation of $n$ variables. We first prove in Theorem 2.4 that any transformation 
can be computed. Although the program in the proof has exponential length, we shall prove that any 
transformation has a program of linear length. 

We introduce some useful notations for any words $u, v \in A^n$. First, the transposition of $u$ and $v$, denoted 
as $(u, v)$, is the permutation of $A^n$ which maps $u$ to $v$, $v$ to $u$, and fixes any other word in $A^n$. Second, the 
assignment of $u$ to $v$, denoted as $(u \to v)$, is the transformation which maps $u$ to $v$ and fixes any other word 
in $A^n$. Third, we denote the all-zero word as $e^0$ and the $k$-th unit word as $e^k \in A^n$, where $e^k_i = \delta(i, k)$ and 
$\delta$ is the Kronecker delta function. 

Theorem 2.4. Any transformation of $A^n$ can be computed by a program which only consists of transpositions 
$(u, v)$ where $v = u + e^i$ for some $i$ and the assignment $(e^0 \to e^1)$. These instructions are respectively 
represented by 

$y_i \leftarrow y_i + \delta(y, u) - \delta(y, v)$, 
$y_1 \leftarrow y_1 + \delta(y, e^0)$. 

Proof. First of all, a generating set of $\mathrm{Sym}(A^n)$ together with any transformation of rank $q^n - 1$ generates 
all transformations Theorem 3.1.3]. Since the assignment $(e^0 \to e^1)$ is an instruction of rank $q^n - 1$ 
(clearly represented in the bottom row above in update form), we only need to generate $\mathrm{Sym}(A^n)$. 

Order the words of $A^n$ according to the Gray code in [13]; then any two consecutive words $v^j$ and $v^{j+1}$ 
satisfy $v^{j+1} = v^j \pm e^{i_j}$ for some $i_j$. The Coxeter generators $\{(v^j, v^{j+1}) : 1 \le j \le q^n - 1\}$ corresponding to 
this ordering thus are instructions, e.g. if $v^{j+1} = v^j + e^i$, then $(v^j, v^{j+1})$ is represented by 

$y_i \leftarrow y_i + \delta(y, v^j) - \delta(y, v^{j+1})$. 

□ 
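The two update forms in Theorem 2.4 can be checked numerically. The following sketch assumes $A = \mathbb{Z}_q$ with 0-indexed coordinates; the helper names (`delta`, `transposition_instruction`, `assignment_instruction`) and the chosen word `u` are ours.

```python
from itertools import product


def delta(y, u):
    """Kronecker delta on words."""
    return 1 if tuple(y) == tuple(u) else 0


def transposition_instruction(u, i, q):
    """Instruction y_i <- y_i + delta(y, u) - delta(y, v), where v = u + e^i."""
    v = tuple((c + 1) % q if k == i else c for k, c in enumerate(u))

    def g(y):
        y = list(y)
        y[i] = (y[i] + delta(y, u) - delta(y, v)) % q
        return tuple(y)

    return g, v


def assignment_instruction(n, q):
    """Instruction y_1 <- y_1 + delta(y, e^0), which maps e^0 to e^1."""
    zero = (0,) * n

    def g(y):
        y = list(y)
        y[0] = (y[0] + delta(y, zero)) % q
        return tuple(y)

    return g


q, n = 3, 2
words = list(product(range(q), repeat=n))
u = (2, 1)
g, v = transposition_instruction(u, 0, q)
# g should swap u and v and fix every other word
swapped = all(g(w) == (v if w == u else u if w == v else w) for w in words)
# the assignment should have an image of size q^n - 1
rank = len({assignment_instruction(n, q)(w) for w in words})
print(swapped, rank)
```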

Our framework is particularly interesting for computing using registers only, or equivalently without 
requiring to use primary memory. The instructions in Theorem 2.4 are encoded in assembly in Figure 2.1. 
The instructions are explained as follows. 

• bne y a l (branch not equal) will jump to the instruction labelled by l if $y \ne a$. 

• addi y b (add immediate) adds b to the value stored in y (without carry-out). 

• j l (jump) jumps to the instruction labelled by l. 

Fig. 2.1. Encoding the instructions in Theorem 2.4. 

3. Procedural complexity. 

Definition 3.1 (Procedural complexity). The shortest length of a program computing $f$ is referred to as 
the procedural complexity of $f$ and is denoted as $\mathcal{L}(f)$. By convention, 
the identity has procedural complexity 0. 

We have $\mathcal{L}(f \circ g) \le \mathcal{L}(f) + \mathcal{L}(g)$ for any two transformations $f$ and $g$. Furthermore, if $f$ is a permutation, 
then it is easy to show that $\mathcal{L}(f^{-1}) = \mathcal{L}(f)$. We then obtain that 

$d(f, g) := \mathcal{L}(f \circ g^{-1})$ 

defines a metric on the symmetric group of $A^n$. This is indeed the word metric, with generators given by all 
the permutation instructions. 

We would like to emphasize that the procedural complexity strongly differs from other measures seen in 
complexity theory. For instance, the procedural complexity of any decision problem is simply 1, for it can 
be expressed as computing the instruction whose value is 1 if the instance has an affirmative answer and 
0 otherwise. Also, the procedural complexity is based on the set of all instructions, and not only on circuits 
formed of certain types of gates. Therefore, each instruction can be arbitrarily "complex." 

3.1. Procedural complexity of permutations. The main purpose of this section is to prove that 
the maximum procedural complexity of a permutation in $\mathrm{Sym}(A^n)$ is $2n - 1$, which is independent of the 
cardinality of the alphabet $A$. 

Proposition 3.2 below shows that this quantity is at least $2n - 1$. It is remarkable that the permutation 
which maximises the procedural complexity is very "simple" to describe; this fact highlights the difference 
between the procedural complexity and other complexity measures. 

Proposition 3.2. The procedural complexity of the transposition $(a, b)$ of two words $a, b \in A^n$ is $2d - 1$ 
instructions, where $d$ is the Hamming distance between $a$ and $b$: $d = |\{i : a_i \ne b_i\}|$. 

Proof. Without loss, let $a$ and $b$ disagree on their first $d$ coordinates. Denoting $v^k = (b_1, \ldots, b_k, a_{k+1}, \ldots, a_n)$ 
for $1 \le k \le d$, we obtain 

$(a, b) = (a, v^1) \circ \cdots \circ (v^{d-2}, v^{d-1}) \circ (v^{d-1}, b) \circ \cdots \circ (v^1, v^2) \circ (a, v^1)$. 

Each transposition involves words differing in at most one position, and hence is an instruction. For instance, 
$(a, v^1)$ is the instruction 

$y_1 \leftarrow y_1 + (b_1 - a_1)\,(\delta(y, a) - \delta(y, v^1))$. 

Therefore, the procedural complexity is at most $2d - 1$ instructions. 

Conversely, suppose that there exists a program computing $(a, b)$ with fewer than $2d - 1$ instructions. 
In that program, at least two coordinates are only updated once (say $i$ before $j$). Denote the images of $a$ 
and $b$ before the update of $y_j$ as $a'$ and $b'$, respectively. Note that $a'_i = b_i$ and $b'_i = a_i$, since $y_i$ will not be 
updated any further. The update of $y_j$ is given by 

$y_j \leftarrow y_j + (b_j - a'_j)\,(\delta(y, a') - \delta(y, b'))$, 

since coordinate $j$ cannot be modified for any program input other than $a$ or $b$, and it must indeed give the 
correct values for these two inputs. However, this update is not bijective, for $a'$ and $b'$ differ in coordinate 
$i$. □ 
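The upper-bound half of the proof is constructive and can be replayed on a small example. The sketch below assumes $A = \mathbb{Z}_q$; the function names and the example words are ours. It builds the chain $a = v^0, v^1, \ldots, v^d = b$ and returns the $2d - 1$ single-coordinate transpositions in execution order.

```python
from itertools import product


def transposition(u, v):
    """Permutation swapping words u and v and fixing every other word."""
    return lambda w: v if w == u else u if w == v else w


def transposition_program(a, b):
    """Return 2d - 1 pairs (u, v), each differing in one coordinate,
    whose successive application realises the transposition (a, b)."""
    chain = [a]
    for i in (i for i in range(len(a)) if a[i] != b[i]):
        w = list(chain[-1])
        w[i] = b[i]
        chain.append(tuple(w))
    up = list(zip(chain, chain[1:]))  # (v^0, v^1), ..., (v^{d-1}, v^d)
    return up + up[-2::-1]            # then back down to (v^0, v^1)


def run(steps, w):
    for u, v in steps:
        w = transposition(u, v)(w)
    return w


q, a, b = 3, (0, 1, 2), (2, 1, 0)  # Hamming distance d = 2
prog = transposition_program(a, b)
ok = all(run(prog, w) == transposition(a, b)(w)
         for w in product(range(q), repeat=3))
print(len(prog), ok)
```

This only illustrates the upper bound; the matching lower bound relies on the counting argument in the proof.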
To prove an upper bound on the procedural complexity, we need to study the properties of functions. We 
use the terminology of [9]. Although this upper bound was proved in [4], we give an alternate proof below, 
which connects the topic of this paper to the study of coordinate functions and combinatorial representations 
from [9]. 

Definition 3.3. Let $B$, $C$ be two alphabets. A function $f : B \times C \to B$ is balanced if $|f^{-1}(b)| = |C|$ 
for all $b \in B$. 

It is easily shown that for any two functions $f : B \times C \to B$ and $h : B \times C \to C$, $(f, h)$ is a permutation 
of $B \times C$ if and only if $f$ is balanced and $h(f^{-1}(b)) = C$ for all $b \in B$ [9]. 

Proposition 3.4. For any pair of balanced functions $f, g : B \times C \to B$, there exists $h : B \times C \to C$ 
such that $(f, h)$ and $(g, h)$ are permutations of $B \times C$. 

Proof. Let $G$ be the bipartite graph with vertex set given by two copies of $B$ and with $|(f, g)^{-1}(i, j)|$ 
edges between $i$ and $j$. Since $f$ and $g$ are balanced, $G$ is $|C|$-regular and hence its edges are $|C|$-colourable. 
Let $h$ be such a colouring. Then for all $i \in B$, we have $h(f^{-1}(i)) = \bigcup_{j \in B} h((f, g)^{-1}(i, j)) = C$ and similarly 
$h(g^{-1}(j)) = C$. This is equivalent to $(f, h)$ and $(g, h)$ being permutations. □ 

Theorem 3.5. The maximum procedural complexity of a permutation of $A^n$ is $2n - 1$ instructions. 

Proof. Proposition 3.2 shows that the maximum complexity is at least $2n - 1$. We then prove that any 
permutation $f$ can be computed by a program with at most $2n - 1$ instructions. 

We prove the following claim: for any $1 \le k \le n - 1$, there exists a function $h_k : A^n \to A$ of $x$ such 
that $(h_1, \ldots, h_k, x_{k+1}, \ldots, x_n)$ and $(h_1, \ldots, h_k, f_{k+1}, \ldots, f_n)$ are permutations. This is clear for $k = 1$: apply 
Proposition 3.4 to $(f_2, \ldots, f_n)$ and $(x_2, \ldots, x_n)$. Let us assume it is true for up to $k - 1$, then by hypothesis, 
$g^1 := (h_1, \ldots, h_{k-1}, x_{k+1}, \ldots, x_n)$ and $g^2 := (h_1, \ldots, h_{k-1}, f_{k+1}, \ldots, f_n)$ are both balanced functions from 
$A^n$ to $A^{n-1}$ (since $(g^1, x_k)$ and $(g^2, f_k)$ are permutations, respectively). Applying Proposition 3.4 to these 
functions then proves the claim. 

The program then proceeds as follows: 

• Step 1. For $k$ from 1 to $n - 1$, do $y_k \leftarrow h_k(x)$. 

• Step 2. For $k$ from $n$ down to 1, do $y_k \leftarrow f_k(x)$. 

□ 
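For $n = 2$ the two steps above give a program of three instructions: $y_1 \leftarrow h_1(x)$, $y_2 \leftarrow f_2(x)$, $y_1 \leftarrow f_1(x)$. The sketch below is a brute-force illustration over $\mathbb{Z}_q^2$, not the colouring argument of Proposition 3.4: it simply searches all functions $h_1$ for one such that $(h_1, x_2)$ and $(h_1, f_2)$ are both permutations. The function name `three_step_program` is ours.

```python
from itertools import permutations, product


def three_step_program(f, q):
    """f: dict mapping each word of Z_q^2 to its image under a permutation.
    Returns the register contents (as functions of x) after each of the
    three instructions, or None if no suitable h is found."""
    words = list(product(range(q), repeat=2))
    for values in product(range(q), repeat=len(words)):
        h = dict(zip(words, values))
        stage1 = {x: (h[x], x[1]) for x in words}     # after y_1 <- h(x)
        stage2 = {x: (h[x], f[x][1]) for x in words}  # after y_2 <- f_2(x)
        if (len(set(stage1.values())) == len(words)
                and len(set(stage2.values())) == len(words)):
            return stage1, stage2, f                  # after y_1 <- f_1(x)
    return None


q = 2
words = list(product(range(q), repeat=2))
found = [three_step_program(dict(zip(words, image)), q)
         for image in permutations(words)]
print(all(p is not None for p in found))
```

Because consecutive stages are permutations differing in a single coordinate, each step is a valid permutation instruction. The search is exponential in $q^n$ and only meant for tiny cases.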

We can represent computations of any permutation of $A^n$ as progressing around the Cayley graph [12] 

$\mathrm{Cay}(\mathrm{Sym}(A^n), \mathcal{I})$. 

The set of permutation instructions $\mathcal{I} \subseteq \mathrm{Sym}(A^n)$ is described as follows. Let $g$ be the instruction $y_i \leftarrow g_i(y)$. 
Then in view of the remarks made after Definition 3.3, $g$ is a permutation if and only if $g_i : A^n \to A$ satisfies 

$g_i(\{u \in A^n : (u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n) = v\}) = A$ 

for all $v \in A^{n-1}$. There are hence $q!$ choices for the restriction of $g_i$ to each pre-image, and hence $(q!)^{q^{n-1}}$ 
choices for $g_i$. Since the identity has been counted $n$ times, there are 

$|\mathcal{I}| = n\,(q!)^{q^{n-1}} - n + 1$ 

instructions. 
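The count of permutation instructions can be cross-checked by exhaustive enumeration for small parameters. In the sketch below (function names ours), every instruction updating coordinate $i$ is generated as a table of values and kept if it is bijective; using a set removes the repeated identity.

```python
from itertools import product
from math import factorial


def count_permutation_instructions(q, n):
    """Enumerate all instructions of Z_q^n and count the bijective ones."""
    words = list(product(range(q), repeat=n))
    found = set()
    for i in range(n):
        for values in product(range(q), repeat=len(words)):
            # image of every word when coordinate i is replaced by g_i(word)
            image = tuple(w[:i] + (c,) + w[i + 1:] for w, c in zip(words, values))
            if len(set(image)) == len(words):
                found.add(image)
    return len(found)


def formula(q, n):
    return n * factorial(q) ** (q ** (n - 1)) - n + 1


for q, n in [(2, 2), (2, 3), (3, 2)]:
    print(q, n, count_permutation_instructions(q, n), formula(q, n))
```

For $q = 2$, $n = 2$ this gives the seven instructions listed in Section 2.1.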

Note that the inverse of $g$ is given by the instruction $h$ which also updates the $i$-th coordinate and 
satisfies 

$h_i(x_1, \ldots, x_{i-1}, g_i(x), x_{i+1}, \ldots, x_n) = x_i$. 

Therefore, the set of permutation instructions updating a given coordinate forms a group, isomorphic to 
$\mathrm{Sym}(A)^{q^{n-1}}$. 

We have determined the maximum procedural complexity in Theorem 3.5. We are now interested in the 
expected complexity. Proposition 3.6 gives a lower bound on that quantity. 

Proposition 3.6. The proportion of permutations with procedural complexity at least 

$\left\lfloor \dfrac{n \log q - 1}{q^{-1} \log q! + q^{-n} \log n} \right\rfloor$ 

tends to 1 when $n$ tends to infinity. 

Proof. Any transformation with procedural complexity $l$ can be expressed as a product of $l$ instructions. 
Therefore, the number of permutations with procedural complexity at most $l$ is no more than the number 
of $l$-tuples of permutation instructions, given by $|\mathcal{I}|^l$. We have 

$|\mathcal{I}| < n\,(q!)^{q^{n-1}} = \exp(\log n + q^{n-1} \log q!)$, 

$|\mathrm{Sym}(A^n)| = q^n! > \sqrt{2\pi q^n}\, q^{n q^n} \exp(-q^n) = \sqrt{2\pi q^n}\, \exp(q^n (n \log q - 1))$. 

Denoting $B = \dfrac{n \log q - 1}{q^{-1} \log q! + q^{-n} \log n}$, we then obtain $|\mathrm{Sym}(A^n)| > \sqrt{2\pi q^n}\, |\mathcal{I}|^B$ and hence the proportion of 
permutations with procedural complexity at most $\lfloor B \rfloor$ is upper bounded by 

$\dfrac{|\mathcal{I}|^{\lfloor B \rfloor}}{|\mathrm{Sym}(A^n)|} < \dfrac{1}{\sqrt{2\pi q^n}}$, 

which tends to zero. □ 

In particular, Proposition 3.6 shows that for $n$ large, almost all permutations of $\mathrm{GF}(2)^n$ have procedural 
complexity at least $2n - 2$. However, the bound in Proposition 3.6 decreases with $q$. 
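The quantity $B$ from the proof is easy to evaluate numerically. The sketch below (function name ours, natural logarithms assumed as in the Stirling estimate) illustrates the remark for $q = 2$: almost all permutations have complexity strictly greater than $\lfloor B \rfloor$, and for large $n$ one finds $\lfloor B \rfloor = 2n - 3$, hence complexity at least $2n - 2$.

```python
from math import factorial, floor, log


def bound(q, n):
    """B = (n log q - 1) / (q^{-1} log q! + q^{-n} log n)."""
    return (n * log(q) - 1) / (log(factorial(q)) / q + log(n) / q ** n)


for n in (20, 30):
    print(n, bound(2, n), floor(bound(2, n)) + 1, 2 * n - 2)

# the bound decreases when the alphabet grows, for fixed n
print(bound(2, 20), bound(3, 20), bound(4, 20))
```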

We now show how the problem of determining the procedural complexity of a given permutation can be 
reduced to the case of so-called ordered permutations for nearly all permutations. 

Definition 3.7 (Ordered function). Let $A$ and $A^n$ be ordered (say, using the lexicographic order). For 
any balanced function $f_i : A^n \to A$ and any $a \in A$, we denote the minimum element of $f_i^{-1}(a)$ as $m(a)$. We 
say $f_i$ is ordered if $m(0) < m(1) < \cdots < m(q - 1)$. 

Any function $f_i : A^n \to A$ can be uniquely expressed as 

$f_i = \sigma_i \circ f_i^*$, 

where $\sigma_i \in \mathrm{Sym}(A)$ and $f_i^*$ is ordered. In this case, we say that $f_i$ is parallel to $f_i^*$ [9]. 
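The decomposition $f_i = \sigma_i \circ f_i^*$ amounts to relabelling the values of $f_i$ in order of first appearance. The sketch below assumes that the function is given as its list of values on the words of $A^n$ taken in increasing order, and that it is onto (as any balanced function is); the name `ordered_decomposition` and the example values are ours.

```python
def ordered_decomposition(values):
    """values[k] is f evaluated at the k-th word. Returns (sigma, f_star),
    where sigma[r] is the r-th distinct value to appear and
    values[k] == sigma[f_star[k]] for every k."""
    sigma = []
    for v in values:
        if v not in sigma:
            sigma.append(v)
    f_star = [sigma.index(v) for v in values]
    return sigma, f_star


# a balanced function from Z_3^2 to Z_3, listed over the 9 words in order
values = [2, 0, 2, 1, 0, 1, 1, 2, 0]
sigma, f_star = ordered_decomposition(values)
print(sigma, f_star)
```

Here `f_star` is ordered in the sense of Definition 3.7: the value 0 appears first, then 1, then 2.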

By extension, we say that $f$ is ordered if all its coordinate functions are ordered. Therefore, to any 
permutation $f$, we associate the ordered permutation $f^*$ where $f_i = \sigma_i \circ f_i^*$ for some $\sigma_1, \ldots, \sigma_n \in \mathrm{Sym}(A)$. 

Proposition 3.8. There exists a shortest program computing $f^*$ using only ordered instructions. 
Furthermore, its length satisfies 

$\mathcal{L}(f^*) \le \mathcal{L}(f) \le \mathcal{L}(f^*) + T(f)$, (3.1) 

where $T(f)$ is the number of nearly trivial (parallel to the trivial coordinate function) coordinate functions 
of $f$: 

$T(f) = |\{i : f_i^* = x_i,\ f_i \ne x_i\}|$. 

Proof. The proofs of the different claims all use the idea of converting programs. 

We first prove that there exists a shortest program computing $f^*$ using only ordered instructions. Let 
$f^* = g^L \circ \cdots \circ g^1$ be a shortest program computing $f^*$. We can easily convert it to another program $h^L \circ \cdots \circ h^1$ 
using only ordered instructions as follows. First let $h^1 = g^{1*}$. Then before $g^j$, we can express the content of 
the $i$-th cell as $y_i = \rho_i \circ y_i^*$ for all $1 \le i \le n$. Replace the instruction $y_l \leftarrow g^j_l(y)$ by 

$y_l \leftarrow h^j_l(y) = \tau\, g^j_l(\rho_1 \circ y_1, \ldots, \rho_n \circ y_n)$, 

where $\tau \in \mathrm{Sym}(A)$ guarantees that the instruction $h^j$ is indeed ordered. It is easy to check that converting 
all instructions in this fashion does yield a program computing $f^*$. 

We now prove that $\mathcal{L}(f^*) \le \mathcal{L}(f)$. Consider a shortest program $g^L \circ \cdots \circ g^1$ computing $f$ and convert it 
as follows to compute $f^*$. First, replace any final update $y_i \leftarrow f_i(x)$ by $y_i \leftarrow \sigma_i^{-1} \circ f_i(x) = f_i^*(x)$. Second, 
after this final update, replace any occurrence of $y_i$ by $\sigma_i y_i$. 

We finally prove that $\mathcal{L}(f) \le \mathcal{L}(f^*) + T(f)$. Consider a shortest program $g^L \circ \cdots \circ g^1$ computing $f^*$ 
(note that it may or may not update any of the coordinates $y_i$ for which $f_i$ is nearly trivial) and convert it as 
follows to compute $f$. First, replace any final update $y_i \leftarrow f_i^*(x)$ by $y_i \leftarrow \sigma_i \circ f_i^*(x) = f_i(x)$. Second, after 
this final update, replace any occurrence of $y_i$ by $\sigma_i^{-1} y_i$. Third, update the eventual nearly trivial coordinate 
functions which have not been updated yet (there are at most $T(f)$ of them). □ 

3.2. Procedural complexity of all transformations. We have shown that any permutation of $A^n$ 
can be computed in $2n - 1$ memoryless instructions. We have also shown that any transformation can 
be computed by some memoryless program. The aim of this section is to derive an upper bound on the 
procedural complexity of any transformation which only depends on $n$; this upper bound turns out to be 
$4n - 3$ instructions. 

Definition 3.9 (Lexicographic order). For any $a = (a_1, \ldots, a_n) \in A^n$ ($A = \mathbb{Z}_q$), we define the 
lexicographic order of $a$ as the integer $\sum_{i=1}^{n} a_i q^{i-1}$. For the sake of conciseness and clarity of notation, we 
shall abuse notation and identify $a$ with its lexicographic order. 

In the lexicographic order, the all-zero word is in zero-th position, then $(1, 0, \ldots, 0)$ is in first, $(0, 1, 0, \ldots)$ 
is in $q$-th position, and so on until $(q - 1, \ldots, q - 1)$ in last and $(q^n - 1)$-th position. The lexicographic order 
yields the concept of interval, defined below. 

Definition 3.10. An interval of $A^n$ is any subset of the form 

$[b, c] := \{x \in A^n : b \le x \le c\}$ 

for any $0 \le b \le c \le q^n - 1$. 

Recall that an integer partition of an integer $s$ is a sequence of positive integers whose sum is equal to 
$s$. Although the terms in the sequence are usually sorted in decreasing order, we do not do so in this paper. 

Definition 3.11. For any integer partition $\lambda = (\lambda_1, \ldots, \lambda_k)$ of $q^n$, $p^\lambda$ is defined to be the transformation 
of $A^n$ such that 

$p^\lambda([0, \lambda_1 - 1]) = 0$, 

$p^\lambda\left(\left[\sum_{i=1}^{j-1} \lambda_i,\ \sum_{i=1}^{j} \lambda_i - 1\right]\right) = j - 1$ for all $1 \le j \le k$. 



Proposition 3.12. Let $f$ be a permutation of $A^n$ which can be computed as a product of $n$ instructions 
updating $y_1$ to $y_n$. Then for any integer partition $\lambda$ of $q^n$, the transformation $g = f \circ p^\lambda$ can also be computed 
as a product of $n$ instructions updating $y_1$ to $y_n$. 

Proof. In order to simplify notations, we denote $p^\lambda$ as $p$. We first prove the following claim: if $a \ge b \in A^n$ 
agree on coordinates $i$ to $n$ for some $i$, then $p(a)_l = p(b)_l$ for all $l \ge i$. 

Proof of claim: We have $p(c + 1) \in \{p(c), p(c) + 1\}$ for any $0 \le c \le q^n - 2$ and hence 

$0 \le p(a) - p(b) \le a - b$. 

Therefore, if $a_l = b_l$ for all $l \ge i$, then $a - b < q^{i-1}$, which yields $p(a) - p(b) < q^{i-1}$ and hence these two 
words agree on positions from $i$ to $n$. 

Let $f = f^n \circ \cdots \circ f^1$ where $f^i$ is an instruction updating $y_i$ for all $i$. Let $g^i$ be the transformation 
obtained after the instructions $y_m \leftarrow g_m(y)$ for $m$ from 1 to $i$; we have 

$g^i(x) = (g_1(x), \ldots, g_i(x), x_{i+1}, \ldots, x_n)$. 

Then we only need to prove that for all $1 \le i \le n - 1$ and all $a \ge b \in A^n$, 

$g^i(a) = g^i(b) \Rightarrow g(a) = g(b)$. 

For any $m \le i$, we have $g^i_m = g_m = (f \circ p)_m = f_m \circ p$. Therefore, $g^i(a) = g^i(b)$ if and only if 
$f_m(p(a)) = f_m(p(b))$ for all $m \le i$ and $a_l = b_l$ for all $l \ge i + 1$. By the claim above, we obtain $p(a)_l = p(b)_l$ 
for all $l \ge i + 1$. Thus $g^i(a) = g^i(b)$ implies $h(p(a)) = h(p(b))$, where 

$h(x) = (f^i \circ \cdots \circ f^1)(x) = (f_1(x), \ldots, f_i(x), x_{i+1}, \ldots, x_n)$. 

Since $h$ is a permutation, we obtain $p(a) = p(b)$ and hence $g(a) = g(b)$. □ 

Theorem 3.13. Any transformation of $A^n$ can be computed in at most $4n - 3$ instructions. 

Proof. Let $f$ be a transformation of $A^n$ and consider the integer partition $\lambda$ induced by its pre-images: 
denote $f(A^n) = \{a_1, \ldots, a_k\}$ and $\lambda_i = |f^{-1}(a_i)|$ for all $1 \le i \le k$. Then $f$ can be expressed as 

$f = h \circ p^\lambda \circ g$, 

where $g$ and $h$ are permutations of $A^n$ satisfying 

$g(f^{-1}(a_j)) = \left[\sum_{i=1}^{j-1} \lambda_i,\ \sum_{i=1}^{j} \lambda_i - 1\right]$, 

$h(j - 1) = a_j$ 

for all $1 \le j \le k$. 

By Theorem 3.5, $g$ and $h$ can be computed as follows, where the superscript indicates which coordinate 
is updated by each instruction: 

$g = \bar{g}^1 \circ \cdots \circ \bar{g}^{n-1} \circ g^n \circ \cdots \circ g^1$, 
$h = \bar{h}^1 \circ \cdots \circ \bar{h}^{n-1} \circ h^n \circ \cdots \circ h^1$. 

By Proposition 3.12, the transformation $h^n \circ \cdots \circ h^1 \circ p^\lambda$ can be computed in $n$ instructions $p^n \circ \cdots \circ p^1$. 
Furthermore, $p^1$ and $\bar{g}^1$ being instructions updating $y_1$, their product $q^1 = p^1 \circ \bar{g}^1$ is another instruction 
updating $y_1$. Thus, $f$ can be computed by the following program of length $4n - 3$: 

$f = \bar{h}^1 \circ \cdots \circ \bar{h}^{n-1} \circ p^n \circ \cdots \circ p^2 \circ q^1 \circ \bar{g}^2 \circ \cdots \circ \bar{g}^{n-1} \circ g^n \circ \cdots \circ g^1$. 

□ 
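The factorisation $f = h \circ p^\lambda \circ g$ at the start of the proof can be built explicitly. In the sketch below, words are identified with their lexicographic order $0, \ldots, q^n - 1$, so a transformation is a list of integers. The function name `decompose`, the choice of listing the image values in increasing order, and the example `f` are ours.

```python
def decompose(f):
    """f: list with f[x] in range(N). Returns lists (g, p, h) such that
    g and h are permutations of range(N), p collapses consecutive
    intervals, and f[x] == h[p[g[x]]] for every x."""
    N = len(f)
    image = sorted(set(f))  # a_1, ..., a_k (one possible ordering)
    g, p = [None] * N, [None] * N
    pos = 0
    for j, a in enumerate(image):
        for x in (x for x in range(N) if f[x] == a):
            g[x] = pos  # g sends the pre-image of a_j onto the j-th interval
            p[pos] = j  # p sends the j-th interval to a single word
            pos += 1
    # h maps the word indexing each interval to the matching image value,
    # and is extended arbitrarily to a permutation
    h = image + [y for y in range(N) if y not in image]
    return g, p, h


f = [3, 3, 0, 5, 0, 3, 7, 7, 0]  # a transformation of Z_3^2 (9 words)
g, p, h = decompose(f)
print(g, p, h)
print(all(f[x] == h[p[g[x]]] for x in range(len(f))))
```

Turning `g` and `h` into instruction sequences is the role of Theorem 3.5 and is not attempted here.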

We conclude this section with a remark on infinite alphabets. If $A$ is infinite, there exists a bijection 
$h : A^n \to A$ and thus any transformation can be computed in $n + 1$ instructions by the following program: 

$y_n \leftarrow h(y)$ 
$y_1 \leftarrow f_1(x)$ 
$\vdots$ 
$y_n \leftarrow f_n(x)$. 

4. Computing linear transformations. 

4.1. Program computing linear transformations. We are now concerned with the case where $q$ is 
a prime power and the inputs are elements of a finite field $A = \mathrm{GF}(q)$, and we want to compute 
a linear transformation $f$ of $A^n$, i.e. 

$f(x) = M x^\top$ 

for some matrix $M \in A^{n \times n}$. Each coordinate function $f_i$ of $f$ can be viewed as the inner product of a 
row of $M$ with the input vector $x$. Therefore, we shall abuse notations slightly and refer to that row as $f_i$: 
$f_i(x) = f_i \cdot x$. In this section, we restrict ourselves to linear instructions only, i.e. instructions of the form 

$y_i \leftarrow a \cdot y = \sum_{j=1}^{n} a_j y_j$ 

for some $a = (a_1, \ldots, a_n) \in A^n$. 

This is equivalent to calculating the matrix $M$ as a product of matrices $M = M_1 \cdots M_L$, where $M_i$ is a 
matrix which only modifies one row. If $M$ is nonsingular, this is also equivalent to a sequence of matrices 
$N_0 = I_n, N_1, \ldots, N_{L-1}, N_L = M$ where $N_i$ is nonsingular and $N_i$ and $N_{i+1}$ only differ by one row for all $i$. 
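To illustrate the row-by-row view, the sketch below tracks the matrix $N_i$ whose rows express the registers as linear forms in $x$. The example is ours and does not appear in the text: over $\mathrm{GF}(2)$ with $n = 3$, four linear instructions of the form $y_i \leftarrow y_1 + y_2 + y_3$ compute the cyclic shift $(x_1, x_2, x_3) \mapsto (x_2, x_3, x_1)$. The helper name `apply_linear_instruction` is also ours, and $q$ is assumed prime so that arithmetic modulo $q$ is field arithmetic.

```python
def apply_linear_instruction(rows, i, a, q):
    """Replace row i by sum_j a[j] * rows[j] (mod q).
    rows[r] holds register y_r as a linear form in the inputs x."""
    n = len(rows)
    new = [sum(a[j] * rows[j][c] for j in range(n)) % q for c in range(n)]
    return rows[:i] + [new] + rows[i + 1:]


q, n = 2, 3
rows = [[int(r == c) for c in range(n)] for r in range(n)]  # N_0 = I_n
history = [rows]
for i in (0, 2, 1, 0):  # 0-indexed coordinates updated in turn
    rows = apply_linear_instruction(rows, i, [1, 1, 1], q)
    history.append(rows)
print(rows)
```

Consecutive matrices in `history` differ in exactly one row, as in the sequence $N_0, \ldots, N_L$ above. Each instruction here involves all three rows, which is what distinguishes this setting from two-row elimination steps.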



Gaussian elimination indicates that any matrix can be computed by linear instructions involving only 
two rows. The number of such instructions required to compute any matrix is on the order of $n^2$. However, 
since we allow any linear instruction involving all $n$ rows, we can obtain shorter programs. Theorem 4.1 
shows that all matrices can be computed in a linear number of instructions. 

Theorem 4.1. Any $n \times n$ nonsingular matrix $M$ can be computed by at most $2n - 1$ linear instructions. 
Furthermore, this can be done in two main steps: 

• The first step updates row $i$ for $i$ from 1 to $n - 1$ to produce an upper unitriangular matrix (i.e., 
with ones on the diagonal). 

• The second step updates row $i$ for $i$ from $n$ down to 1 to produce $M$. 

In general, any $n \times n$ matrix with rank $\rho \ge 1$ can be computed in $n + \rho - 1$ linear instructions. 

Proof. The proof of correctness of the algorithm for nonsingular matrices actually goes in reverse: we 
start from $M$ and construct the identity matrix. We first justify the first step of the reversed algorithm: 
$M$ can be triangularised in $n$ instructions. Let us prove that after $k$ instructions we can obtain a matrix 
$M_k$ where the upper left $k \times k$ submatrix is upper unitriangular, by induction on $k$ ($1 \le k \le n$). We shall 
consider the $(n - 1) \times k$ matrix $N$ formed by the first $k$ columns and all but the $k$-th row of $M_{k-1}$ ($M_0 = M$). 
For $k = 1$, we need to consider two cases: 

1. The $(1, 1)$ entry of $M$ is nonzero, then scaling the first row will work: 

$y_1 \leftarrow M(1, 1)^{-1} y_1$. 

2. Otherwise, there exists a non-zero element $M(j, 1)$, then do 

$y_1 \leftarrow y_1 + M(j, 1)^{-1} y_j$. 

Now assume the claim holds for $k - 1$. Once again, we distinguish two cases on $N$: 

1. The unit vector $e^k = (0, \ldots, 0, 1) \in A^k$ is not in the row span of $N$. Then simply replace row $y_k$ 
with $e^k \in A^n$. 

2. Otherwise, by hypothesis the $k \times k$ matrix whose rows are given by the first $k - 1$ rows of $N$ together 
with $e^k$ is upper unitriangular. Therefore, $N$ has full rank and there is a linear combination of rows 
$vN$ satisfying 

$vN + (y_k)^k = e^k$, 

where $(y_k)^k$ are the first $k$ positions of $y_k$. Therefore, denoting all but the $k$-th row of $M_{k-1}$ as $N'$, 
perform 

$y_k \leftarrow y_k + vN'$. 

We now prove the second step: any upper unitriangular matrix can be turned into the identity matrix 
in $n - 1$ instructions. Let us prove that we can obtain a matrix whose last $k$ rows are equal to those of the 
identity matrix in $k - 1$ instructions ($1 \le k \le n$). For $k = 1$, this is trivial. Suppose it holds for $k - 1$ and 
denote $y_{n-k+1} = (0, \ldots, 0, 1, a_{n-k+2}, \ldots, a_n)$, then perform 

$y_{n-k+1} \leftarrow y_{n-k+1} - \sum_{i=n-k+2}^{n} a_i y_i$. 

We now consider matrices with rank $1 \le \rho < n$. We prove the claim by induction on $n$, the claim being 
clear for $n = 1$. Assume it is true for up to $n - 1$ and let us compute the matrix $M \in A^{n \times n}$. Without loss 
of generality, let the first $\rho$ rows of $M$ be linearly independent and denote them as $(N|P) \in A^{\rho \times n}$, where $N$ has $\rho$ 
columns and $P$ has $n - \rho$ columns. 

By hypothesis, there is a program with length at most $2\rho - 1$ which can compute $N$. Suppose $N$ 
has $k$ rows equal to those of the identity matrix. Then it is easily shown that there exists a program which 
computes $N$ in no more than $2(\rho - k) - 1$ instructions. This program can be appended by $k$ trivial instructions 
$y_j \leftarrow y_j$ to obtain a program which computes $N$, which updates all rows, and which has no more than $2\rho - 1$ 
instructions in total. 
We adapt this program so that it computes $(N|P)$ as follows. Suppose that $y_j \leftarrow f_j$ is the first final 
update therein. Then applying the program, it should yield the $j$-th row of $N$ for the first $\rho$ coordinates and 
the all-zero vector for the last $n - \rho$ coordinates. However, the $j$-th row of $P$, say $v_j$, can be expressed as a 
linear combination of the last $n - \rho$ rows of the $n \times n$ identity matrix. Therefore, simply replace $y_j \leftarrow f_j$ by 

$y_j \leftarrow f_j + v_j$. 

Subsequently, replace any occurrence of $y_j$ by $y_j - v_j$ in the program. Do this operation for all rows, and 
we obtain a program which computes $(N|P)$ in at most $2\rho - 1$ instructions. Finally, all other $n - \rho$ rows of 
$M$ can be expressed as linear combinations of the first $\rho$, so it only takes $n - \rho$ final updates. □ 

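4.2. Further results for nonsingular matrices. Let us characterise the set $\mathcal{M}(\mathrm{GF}(q)^n)$ of invertible 
linear instructions. It is given by the set of nonsingular matrices with at most one nontrivial row: 

$\mathcal{M} = \{S(i, v) : 1 \le i \le n,\ v \in A^n(i)\}$, 

where 

$A^n(i) = \{v \in A^n : v_i \ne 0\}$ for all $1 \le i \le n$, 

$S(i, v) = \begin{pmatrix} I_{i-1} & 0 \\ \multicolumn{2}{c}{v} \\ 0 & I_{n-i} \end{pmatrix} \in A^{n \times n}$, 

i.e. $S(i, v)$ is the identity matrix whose $i$-th row is replaced by $v$. 

Remark that $S(i, v)^{-1} = S(i, w)$ for all $i, v$, where $w_i = v_i^{-1}$ and $w_j = -v_i^{-1} v_j$ for $j \ne i$, and 

$|\mathcal{M}| = n q^{n-1} (q - 1) - n + 1$. 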
Computing a nonsingular matrix is hence equivalent to progressing around the Cayley graph 

$G := \mathrm{Cay}(\mathrm{GL}(n, q), \mathcal{M})$. 

Our previous results imply that $G$ is undirected and connected. Since it is a Cayley graph, it is 
vertex-transitive and in particular it is regular of valency $|\mathcal{M}| - 1 = n(q^n - q^{n-1} - 1)$. The following are equivalent: 

1. $M$ and $N$ are adjacent in $G$. 

2. $M = S(i, v) N$ and $N = S(i, v)^{-1} M$ for some $i$ and $v \in A^n(i)$. 

3. $M$ and $N$ only differ in one row. 

Therefore, $G$ is the subgraph of the Hamming graph $H(n, q^n)$ induced by the general linear group. 

The diameter of $G$ is of great interest as it gives the maximum procedural complexity $\mathcal{L}'(M)$ of computing 
a nonsingular matrix by updating one row at a time. We know that it is no more than $2n - 1$; we shall 
see that it is at least [^J (and hence it is equal to 3 when $n = 2$) but it remains unknown for $n \ge 3$. 
However, when the field $A$ is large, then almost all $n \times n$ matrices can be computed in no more than $n$ linear 
instructions. 

Proposition 4.2. There are exactly 

$(q - 1)^n q^{n(n-1)}$ 

$n \times n$ nonsingular matrices over $\mathrm{GF}(q)$ which can be computed simply by updating their rows from 1 to $n$ in 
increasing order. 

Proof. Let us count such matrices $M$ with rows $f_i$. After the first instruction, we obtain the matrix 
whose first row is equal to $f_1$, while the last $n - 1$ rows do not depend on the matrix we are computing and 
are equal to $(0 | I_{n-1})$. Then $f_1$ can be chosen as any vector not in the span of the last $n - 1$ rows: there are 
hence $(q - 1) q^{n-1}$ choices for $f_1$. Once $f_1$ is fixed, similarly there are $(q - 1) q^{n-1}$ choices for $f_2$, and so on. □ 
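The count in Proposition 4.2 can be verified by brute force for tiny parameters. The sketch below assumes $q$ prime (so that integers modulo $q$ form the field) and tests nonsingularity naively by checking that $x \mapsto xM$ is injective; the function names are ours.

```python
from itertools import product


def nonsingular(rows, q):
    """Brute-force test (q prime): the map x -> xM is injective on GF(q)^n."""
    n = len(rows)
    images = {tuple(sum(x[r] * rows[r][c] for r in range(n)) % q for c in range(n))
              for x in product(range(q), repeat=n)}
    return len(images) == q ** n


def count_in_order(q, n):
    """Count matrices reachable from I_n by updating rows 1, ..., n in order
    while staying nonsingular after every update."""
    identity = [tuple(int(r == c) for c in range(n)) for r in range(n)]
    vectors = list(product(range(q), repeat=n))
    count = 0
    for m in product(vectors, repeat=n):
        # after k updates: rows m_1..m_k followed by the remaining identity rows
        if all(nonsingular(list(m[:k]) + identity[k:], q) for k in range(1, n + 1)):
            count += 1
    return count


for q, n in [(2, 2), (3, 2), (2, 3)]:
    print(q, n, count_in_order(q, n), (q - 1) ** n * q ** (n * (n - 1)))
```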

Similar to the general case, we can reduce the problem of determining the complexity of nearly any 
nonsingular matrix to the case of so-called scaled matrices. Note that this concept is not necessarily consistent 
with the concept of ordered permutations; however, it can be viewed as an analogue. 

Definition 4.3. A nonzero vector whose leading nonzero coefficient is equal to 1 is said to be scaled. 
A nonsingular matrix is scaled if all its rows are scaled. 
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Fig. 4.1. Representing a transformation via a graph. 



For instance, the identity matrix is the only scaled diagonal matrix. For any nonzero vector $v \in GF(q)^n$
with leading nonzero coordinate $v_j$, we can express

\[ v = v_j v^* \]

for a unique scaled vector $v^*$. For any nonsingular matrix $M$ with rows $f_i$, let $M^*$ be the corresponding
scaled matrix with rows $f_i^*$. We obtain the linear analogue of Proposition 3.8.

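Proposition 4.4. There exists a shortest linear program computing $M^*$ with only scaled instructions.
Its length satisfies

\[ \mathcal{L}'(M^*) \leq \mathcal{L}'(M) \leq \mathcal{L}'(M^*) + T'(M), \]

where $T'(M)$ is the number of nearly trivial (equal to multiples of the corresponding unit vectors) rows of $M$:

\[ T'(M) = |\{i : f_i = \lambda_i e^i, \lambda_i \in GF(q) \setminus \{0, 1\}\}| = |\{i : f_i \neq e^i, f_i^* = e^i\}|. \]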

4.3. Manipulating variables. We generalise the example of swapping two variables by considering
any manipulation of variables. We distinguish between a transformation $\phi$ of $[n]$ (where we denote $[n] =
\{1, \ldots, n\}$) which represents the formal movement of variables and the transformation $f^\phi$ of $A^n$ it induces
on all the possible values of the variables. Although we do not require that $q$ should be a prime power, in
such a case a manipulation of variables is indeed a linear transformation. Remark that $f^\phi \in \mathrm{Sym}(A^n)$ if and
only if $\phi \in \mathrm{Sym}(n)$. We always use the postfix notation for $\phi$, i.e. the image of $i$ under $\phi$ is denoted as $i\phi$.
For $\phi : [n] \to [n]$, $\phi^k$ does represent the $k$-th power of $\phi$ according to composition.

Definition 4.5. A manipulation of variables is a transformation $f^\phi$ of $A^n$ such that there exists a
transformation $\phi$ of $[n]$ for which

\[ f^\phi(x_1, \ldots, x_n) = (x_{1\phi}, \ldots, x_{n\phi}) \]

for all $x \in A^n$.

The transformation $\phi$ can be represented using a directed graph on $[n]$ with $n$ arcs $(i, i\phi)$ (see Jll! for a
detailed review of this representation of transformations). This directed graph has cycles of two kinds:

• A cycle $(i, i\phi, \ldots, i\phi^{k-1})$ (where $i\phi^k = i$) is detached if for all $0 \leq l \leq k - 1$, there is no $j \neq i\phi^{l-1}$
such that $j\phi = i\phi^l$. Equivalently, the cycle is an entire connected component of the graph.

• A cycle $(i, i\phi, \ldots, i\phi^{k-1})$ is attached otherwise, i.e. if there exist $0 \leq l \leq k - 1$ and $j \in [n]$, $j \neq i\phi^{l-1}$,
such that $j\phi = i\phi^l$.

Note that if $\phi$ is a permutation, then all its cycles are detached.

For instance, consider $\phi : [6] \to [6]$ defined as $1\phi = 2$, $2\phi = 3$, $3\phi = 1$, $4\phi = 2$, $5\phi = 6$, $6\phi = 5$. Then the
cycle $(1, 2, 3)$ is attached to 4, while the cycle $(5, 6)$ is detached, as seen on Figure 4.1.

Example. Let us first consider the case of a cyclic shift of three variables, i.e. $\pi = (1, 2, 3)$ and $f^\pi :
A^3 \to A^3$ such that $f^\pi(x_1, x_2, x_3) = (x_2, x_3, x_1)$. This can be computed via linear combinations:

\[
\begin{aligned}
y_1 &\leftarrow y_1 + y_2 + y_3 && (= x_1 + x_2 + x_3) \\
y_3 &\leftarrow y_1 - y_2 - y_3 && (= x_1) \\
y_2 &\leftarrow y_1 - y_2 - y_3 && (= x_3) \\
y_1 &\leftarrow y_1 - y_2 - y_3 && (= x_2)
\end{aligned}
\]
However, it is impossible to perform this cyclic shift in four instructions by first updating $y_1$ and then
updating $y_2$ instead of $y_3$.

This is an example of the more general result below. 

Proposition 4.6. Let $\pi \in \mathrm{Sym}(n)$ be a cyclic permutation, without loss $\pi = (1, 2, \ldots, n)$. Then the
cyclic shift of $n$ variables $f^\pi : A^n \to A^n$ can be computed in $n + 1$ instructions if and only if the order of
updates (up to starting point) is $y_1, y_n, \ldots, y_2, y_1$.

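Proof. Let us prove that if the order is correct, then we can compute the cyclic shift. This is done via
the following program:

\[
\begin{aligned}
y_1 &\leftarrow \sum_{i=1}^{n} y_i \\
y_n &\leftarrow y_1 - \sum_{i=2}^{n} y_i \\
&\;\;\vdots \\
y_1 &\leftarrow y_1 - \sum_{i=2}^{n} y_i.
\end{aligned}
\]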

We prove the correctness of this program by induction: we claim that after the update of $y_{n-i}$, all variables
$y_n, y_{n-1}, \ldots, y_{n-i}$ have the correct values $x_{n+1} = x_1, x_n, \ldots, x_{n-i+1}$ for $i$ from 0 to $n - 1$. For $i = 0$, we have

\[ y_n \leftarrow y_1 - \sum_{j=2}^{n} y_j = \sum_{l=1}^{n} x_l - \sum_{j=2}^{n} x_j = x_1. \]

Now suppose it holds for up to $i - 1$, we then have

\[ y_{n-i} \leftarrow y_1 - \sum_{j=2}^{n-i} y_j - \sum_{j=n-i+1}^{n} y_j = \sum_{l=1}^{n} x_l - \sum_{j=2}^{n-i} x_j - \sum_{k=n-i+2}^{n+1} x_k = x_{n-i+1}. \]



We now prove the reverse implication. Consider a program computing the shift of variables with $n + 1$
instructions, and let $y_1$ be updated first. Then, for all $1 < i \leq n$, the update of $y_i$ must occur after
that of $y_{i+1}$ (with $y_{n+1}$ read as the final update of $y_1$). Indeed, otherwise after $y_i \leftarrow x_{i+1}$, the content of $(y_i, y_{i+1})$ is $(x_{i+1}, x_{i+1})$ and the resulting
transformation is not a permutation. The only order possible is hence $y_1, y_n, \ldots, y_1$. □
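The program in the proof of Proposition 4.6 can be simulated directly. The sketch below is an illustration only: it assumes the alphabet is $\mathbb{Z}_q$ with addition modulo $q$, uses 0-based indices (so `y[0]` plays the role of $y_1$), and the name `cyclic_shift` is hypothetical.

```python
def cyclic_shift(x, q):
    """Run the (n + 1)-instruction program on y = x over Z_q (assumed alphabet).

    Update order: y_1, y_n, ..., y_2, y_1. Returns the final contents of y
    and the number of instructions executed.
    """
    n = len(x)
    y = list(x)
    y[0] = sum(y) % q                     # y_1 <- sum of all y_i
    count = 1
    for k in range(n - 1, -1, -1):        # y_n, ..., y_2, then y_1 again
        y[k] = (y[0] - sum(y[1:])) % q    # y_k <- y_1 - (y_2 + ... + y_n)
        count += 1
    return y, count
```

For instance, `cyclic_shift([1, 2, 3, 4], 7)` ends with `y` equal to `[2, 3, 4, 1]` after 5 instructions, matching $f^\pi(x_1, x_2, x_3, x_4) = (x_2, x_3, x_4, x_1)$.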

We can then determine the procedural complexity of a manipulation of variables. 

Theorem 4.7. Let $\phi : [n] \to [n]$ have $F$ fixed points and $D$ detached cycles. Then the procedural
complexity of the manipulation of $n$ variables $f^\phi : A^n \to A^n$ is exactly

• $n - F + D$ instructions if $\phi$ is a permutation;

• $n - F + 1$ instructions if $\phi$ is not a permutation and $D > 0$;

• $n - F$ instructions otherwise.

Proof. Let us first suppose that $\phi$ is a permutation. Then computing one cycle after the other yields
a program of length $n - F + D$ by Proposition 4.6. Conversely, assume that there is a program computing
$f^\phi$ in fewer than $n - F + D$ instructions. For this program there must be at least one cycle of $\phi$ such that
each coordinate in the cycle is updated only once. Then after the first such update $y_i \leftarrow x_{i\phi}$, we have
$y_i = y_{i\phi} = x_{i\phi}$, and hence the resulting transformation is not a permutation.

Let us now suppose that $\phi$ is not a permutation. Let $m$ denote the number of variables which are not
fixed and do not belong to any cycle. The subgraph induced on these vertices is acyclic, hence we can order
them as $a_1, \ldots, a_m$ such that $a_i = a_j\phi$ only if $i > j$ [1]. The first part of the program consists in updating
all these vertices but the last in the correct order: for $i$ from 1 to $m - 1$, do

\[ y_{a_i} \leftarrow y_{a_i\phi}. \]

The second part is to perform the cycles by using $y_{a_m}$ as memory. Let $\{i_c : 1 \leq c \leq C\}$ denote a member of
each (detached or attached) cycle of length $l_c$, then do the following instruction:

\[ y_{a_m} \leftarrow \sum_{c=1}^{C} y_{i_c}. \]

Then for all $c$ from 1 to $C$ do

\[
\begin{aligned}
y_{i_c} &\leftarrow y_{i_c\phi} \\
&\;\;\vdots \\
y_{i_c\phi^{l_c-2}} &\leftarrow y_{i_c\phi^{l_c-1}} \\
y_{i_c\phi^{l_c-1}} &\leftarrow y_{a_m} - \sum_{b=1}^{c-1} y_{i_b\phi^{l_b-1}} - \sum_{b=c+1}^{C} y_{i_b}.
\end{aligned}
\]

It can be easily proved by induction on $c$ that this program does compute all cycles. Eventually, we need
the final update of $y_{a_m}$. Note that $a_m\phi$ is either a fixed point or it belongs to a cycle; therefore $x_{a_m\phi}$ is
contained in $y_{a_m\phi^{L+1}}$, where $L = 0$ if $a_m\phi$ is a fixed point and $L = l_c - 1$ if it belongs to the cycle $c$. Thus,
the final update is given by

\[ y_{a_m} \leftarrow y_{a_m\phi^{L+1}}. \qquad (4.1) \]

Since $y_{a_m}$ is the only coordinate updated twice, this program has length $n - F + 1$.

We now simplify this program when $\phi$ has no detached cycles. This time, for $i$ from 1 to $m$, do

\[ y_{a_i} \leftarrow y_{a_i\phi}. \]

Then for all $c$ from 1 to $C$, there exists $a_c \in \{a_1, \ldots, a_m\}$ such that $a_c\phi = i_c$, therefore do

\[
\begin{aligned}
y_{i_c} &\leftarrow y_{i_c\phi} \\
&\;\;\vdots \\
y_{i_c\phi^{l_c-2}} &\leftarrow y_{i_c\phi^{l_c-1}} \\
y_{i_c\phi^{l_c-1}} &\leftarrow y_{a_c}.
\end{aligned}
\]

Since $y_{a_m}$ already contains $x_{a_m\phi}$, there is no need to include the final update in (4.1).

Conversely, it is clear that at least $n - F$ instructions are needed to compute $f^\phi$. Furthermore, assume
$D > 0$ and that there is a program computing $f^\phi$ in exactly $n - F$ instructions. Let $i$ in the cycle $c$ be the
first coordinate belonging to a detached cycle to be updated. Then the program first does $y_i \leftarrow x_{i\phi}$ and the
value of $x_i$ is lost; therefore, the update $y_{i\phi^{l_c-1}} \leftarrow x_i$ cannot occur. □

Theorem 4.7 indicates that disjoint cycles of a permutation cannot be computed "concurrently," for the
shortest program which computes two cycles consists exactly of computing one before the other.
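The three cases of Theorem 4.7 can be evaluated mechanically from $\phi$. The sketch below is a hedged illustration: $\phi$ is given as a dictionary mapping $i$ to $i\phi$, and $D$ is assumed to count only detached cycles of length at least 2 (fixed points are counted in $F$), which is the reading under which a transposition costs 3 instructions. The function names are hypothetical.

```python
def nontrivial_cycles(phi):
    """Return the cycles of length >= 2 in the functional graph of phi."""
    found, visited = [], set()
    for start in phi:
        path, i = [], start
        while i not in visited:
            visited.add(i)
            path.append(i)
            i = phi[i]
        if i in path:                      # the walk closed a new cycle
            cycle = path[path.index(i):]
            if len(cycle) >= 2:
                found.append(cycle)
    return found

def procedural_complexity(phi):
    """Instruction count given by Theorem 4.7 for the manipulation f^phi."""
    n = len(phi)
    F = sum(1 for i in phi if phi[i] == i)
    D = 0
    for cycle in nontrivial_cycles(phi):
        members = set(cycle)
        # detached: no vertex outside the cycle is mapped into it
        if all(phi[j] not in members for j in phi if j not in members):
            D += 1
    if len(set(phi.values())) == n:        # phi is a permutation
        return n - F + D
    return n - F + 1 if D > 0 else n - F
```

With the transformation of $[6]$ used as an example above ($1\phi = 2$, $2\phi = 3$, $3\phi = 1$, $4\phi = 2$, $5\phi = 6$, $6\phi = 5$), this gives $F = 0$, $D = 1$ and hence 7 instructions.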

Corollary 4.8. If $n = 2m$, then computing $m$ disjoint transpositions of variables (e.g. $(1, 2)(3, 4) \cdots (2m - 1, 2m)$)
takes exactly $3m$ instructions. If $n = 2m + 1$, then computing $m - 1$ disjoint transpositions and a
cycle of length 3 (e.g. $(1, 2)(3, 4) \cdots (2m - 3, 2m - 2)(2m - 1, 2m, 2m + 1)$) takes exactly $3m + 1$ instructions.
This is the maximum number of instructions for any manipulation of variables.

In particular, if the variables hold the entries of an $m \times m$ matrix over $A$, then transposing that matrix
takes exactly $3m(m - 1)/2$ instructions.

Another consequence of Theorem 4.7 is that when $\phi$ is not a permutation, we can obtain shorter programs
by using some arithmetic than by adopting the "black box" approach used for the swap of two variables
described in the very beginning of the paper. Figure 4.2 shows the smallest example: computing $f^\phi$ takes
6 instructions when using the program described in the proof of Theorem 4.7 while it takes 7 instructions
when we do not combine variables. Clearly, this example can be generalized by adding more cycles, thus
yielding an arbitrarily large gap between the two approaches. The results are summarised in Proposition 4.9
and Corollary 4.10.

Fig. 4.2. The simplest manipulation of variables using a shorter program with arithmetic. (a) $\phi$. (b) Programs for $f^\phi$.

Proposition 4.9. Let $\phi$ be a transformation of $[n]$ with $F$ fixed points and $D$ detached cycles. Then
the manipulation of variables $f^\phi$ can be computed without memory by instructions of the form $y_i \leftarrow y_j$ for
any $i, j \in [n]$ if and only if $\phi$ is not a permutation (or is the identity). In that case, the shortest length of
such a program is $n - F + D$.

The proof uses arguments similar to those used above and is hence omitted.

Corollary 4.10. If $\phi$ is not a permutation, then the ratio between the procedural complexity of $f^\phi$ over
the minimum length of a program computing $f^\phi$ using instructions of the form $y_i \leftarrow y_j$ is always greater
than 2/3. Conversely, for any $\epsilon > 0$, there exists $\phi$ for which that ratio is between 2/3 and $2/3 + \epsilon$.
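One family that appears to realise the limit in Corollary 4.10 is the following (an illustrative choice, not taken from the paper): $D$ disjoint transpositions, one fixed point, and one further variable which receives the value of the fixed point, so that $n = 2D + 2$ and $F = 1$. Plugging these values into Theorem 4.7 and Proposition 4.9 gives the ratio computed below; the name `ratio` is hypothetical.

```python
from fractions import Fraction

def ratio(D):
    """Ratio of the two program lengths for the assumed family with
    D detached transpositions, one fixed point and one extra variable."""
    n, F = 2 * D + 2, 1
    with_arithmetic = n - F + 1    # Theorem 4.7: not a permutation, D > 0
    copies_only = n - F + D        # Proposition 4.9: instructions y_i <- y_j
    return Fraction(with_arithmetic, copies_only)
```

For $D = 2$ this gives 6/7, in line with the 6 versus 7 instructions quoted for Figure 4.2, and the ratio $(2D + 2)/(3D + 1)$ decreases towards 2/3 as $D$ grows.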

5. Using memory. Suppose we want to compute a transformation $f$ of $A^n$ using $m$ memory cells
storing values in $A$. By convention, we shall denote the content of the $m$ memory cells as $y_{n+1}, \ldots, y_{n+m}$.
Then computing $f$ using $m$ memory cells is equivalent to computing some transformation $h(x_1, \ldots, x_{n+m})$
such that the first $n$ coordinate functions of $h$ coincide with those of $f$. Let us denote the set of such
transformations as $D(f, m)$. The shortest length of a program computing $f$ using $m$ memory cells is hence
given by

\[ \mathcal{L}(f|m) = \min_{h \in D(f,m)} \mathcal{L}(h). \]

Therefore, there exists $h$ such that $\mathcal{L}(h) = \mathcal{L}(f|m)$ but it may be difficult to characterise that transformation
$h$. However, Proposition 5.1 shows that our framework also considers the case of using memory. Indeed,
there is a deterministically (and easily) described transformation $h \in D(f, m)$ for which $\mathcal{L}(h)$ and $\mathcal{L}(f|m)$
are in bijection.

Proposition 5.1. For any transformation $f$ of $A^n$ and any $e = (e_1, \ldots, e_m) \in A^m$, let $h^e \in D(f, m)$
be such that $h^e_{n+i} = e_i$ for $1 \leq i \leq m$. Then

\[ \mathcal{L}(h^e) = \mathcal{L}(f|m) + m. \]



Proof. Let $g \in D(f, m)$ such that $\mathcal{L}(g) = \mathcal{L}(f|m)$, then the shortest program computing $g$ appended
with the suffix $y_{n+i} \leftarrow e_i$ for $i$ from 1 to $m$ has length $\mathcal{L}(f|m) + m$ and computes $h^e$. Therefore, $\mathcal{L}(h^e) \leq
\mathcal{L}(f|m) + m$.

Conversely, consider the shortest program computing $h^e$. It contains $m$ final updates $y_{n+i} \leftarrow e_i$ which,
without loss, appear for $i$ from $m$ down to 1. Then any instruction $y_j \leftarrow g(y)$ occurring after $y_{n+k} \leftarrow e_k$
(hence $j \leq n + k - 1$) can be replaced by $y_j \leftarrow g'(y_1, \ldots, y_{n+k-1})$ where $g' : A^{n+k-1} \to A$ is defined as

\[ g'(y_1, \ldots, y_{n+k-1}) = g(y_1, \ldots, y_{n+k-1}, e_k, \ldots, e_m). \]

Now remove all the $y_{n+i} \leftarrow e_i$ updates; we are left with a program which computes some transformation in
$D(f, m)$ and whose length is given by $\mathcal{L}(h^e) - m$. Thus $\mathcal{L}(f|m) \leq \mathcal{L}(h^e) - m$. □
5.1. Shorter programs. We have shown in Theorem 2.4 that one need not use memory to compute
any transformation. However, we shall prove that one may want to use memory in order to use shorter
programs. In order to clarify notations, whenever $m = 1$, we denote the content of the memory cell as $t$.

We have shown in Theorem 3.5 that any permutation can be computed without memory in at most
$2n - 1$ instructions. On the other hand, using one memory cell necessarily yields a program with length at
least $n + 1$. Propositions 3.2 and 5.2 show that these two results are simultaneously tight: there exists a
permutation $f \in \mathrm{Sym}(A^n)$ for which $\mathcal{L}(f) = 2n - 1$ while $\mathcal{L}(f|1) = n + 1$.

Proposition 5.2. The transposition $(a, b)$ of two words $a, b \in A^n$ at Hamming distance $d$ can be
computed with one memory cell in $d + 1$ instructions: $\mathcal{L}((a, b)|1) = d + 1$.

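Proof. Without loss, let us assume that $a$ and $b$ disagree on their first $d$ coordinates. Then the following
program computes $(a, b)$:

\[
\begin{aligned}
t &\leftarrow \delta(y, a) - \delta(y, b) \\
y_1 &\leftarrow y_1 + (b_1 - a_1)t \\
&\;\;\vdots \\
y_d &\leftarrow y_d + (b_d - a_d)t.
\end{aligned}
\]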

□ 
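The program of Proposition 5.2 can be simulated as follows. This is a sketch under the assumption that the alphabet is $\mathbb{Z}_q$ with arithmetic modulo $q$; rather than reordering coordinates, it updates exactly those positions where $a$ and $b$ disagree. The name `transpose_words` is hypothetical.

```python
def transpose_words(y, a, b, q):
    """Apply the transposition (a, b) to the word y using one memory cell t.

    Returns the new word and the number of instructions (d + 1, where d is
    the Hamming distance between a and b).
    """
    y = list(y)
    # t <- delta(y, a) - delta(y, b), with delta the Kronecker delta
    t = (int(y == list(a)) - int(y == list(b))) % q
    steps = 1
    for i in range(len(y)):
        if a[i] != b[i]:
            y[i] = (y[i] + (b[i] - a[i]) * t) % q   # y_i <- y_i + (b_i - a_i) t
            steps += 1
    return tuple(y), steps
```

If $y = a$ then $t = 1$ and each updated coordinate becomes $b_i$; if $y = b$ then $t = -1$ and each becomes $a_i$; otherwise $t = 0$ and $y$ is unchanged.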

In Theorem 3.13 we have given an upper bound on the complexity of any transformation which only
depends on the number of variables. This upper bound is larger than the $2n - 1$ obtained for permutations;
however, using memory cells yields a program using $2n - 1$ instructions, as seen below.

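Proposition 5.3. Any transformation $f$ of $A^n$ can be computed with $n - 1$ memory cells and no more
than $2n - 1$ instructions: $\mathcal{L}(f|n - 1) \leq 2n - 1$.

Proof. The following program computes $f$ using $n - 1$ memory cells $t_1, \ldots, t_{n-1}$ and $2n - 1$ instructions:

\[
\begin{aligned}
t_1 &\leftarrow y_1 \\
&\;\;\vdots \\
t_{n-1} &\leftarrow y_{n-1} \\
y_1 &\leftarrow f_1(t_1, \ldots, t_{n-1}, y_n) \\
&\;\;\vdots \\
y_n &\leftarrow f_n(t_1, \ldots, t_{n-1}, y_n).
\end{aligned}
\]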

□ 
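The proof of Proposition 5.3 amounts to saving the first $n - 1$ variables and then overwriting each variable once. The following sketch illustrates this; it assumes $f$ is supplied as a list of $n$ coordinate functions, each taking the full input word, and the name `compute_with_memory` is hypothetical.

```python
def compute_with_memory(f, x):
    """Compute f(x) with n - 1 memory cells and 2n - 1 instructions.

    f is a list of n coordinate functions; f[i](word) returns the new
    value of coordinate i (assumed representation).
    """
    n = len(x)
    y = list(x)
    t = [y[i] for i in range(n - 1)]   # t_i <- y_i for i = 1, ..., n - 1
    steps = n - 1
    for i in range(n):
        # y_n is overwritten last, so (t_1, ..., t_{n-1}, y_n) still equals x
        y[i] = f[i](t + [y[n - 1]])
        steps += 1
    return y, steps
```

The last variable $y_n$ needs no copy because it is only overwritten by the final instruction.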

Proposition 5.3 indicates that we do not need any more than $n - 1$ memory cells. Indeed, if we use $n$
memory cells, then the program will have at least $2n$ instructions (unless some memory cells are not updated,
which is equivalent to not using them). Therefore, $\mathcal{L}(f|m) = \mathcal{L}(f|n - 1)$ for any $m \geq n - 1$.

We remark that this upper bound on the amount of memory needed follows from the fact that we allow 
any instruction. In practice, using a large amount of memory is the price paid for using only a restricted 
number of basic instructions. 

This can be easily generalised to the case where $f$ only has $k$ nontrivial coordinate functions. In that
case, using $k - 1$ memory cells yields a program of length at most $2k - 1$ instructions, and hence only $k - 1$
memory cells are needed.

The ideas behind Theorem 3.5 can be adapted to the case of using memory to yield a refinement of
Proposition 5.3 for permutations.

Theorem 5.4. Any permutation of $A^n$ can be computed in at most $3m$ instructions with $m$ memory
cells if $n = 2m$ is even and at most $3m + 3$ instructions with $m + 2$ memory cells if $n = 2m + 1$ is odd.

Proof. Suppose $n = 2m$, let $f \in \mathrm{Sym}(A^n)$ and let $t_1, \ldots, t_m$ denote the memory. By Proposition 3.4
there exist $m$ functions $g_1, \ldots, g_m : A^n \to A$ such that

\[ (f_1, \ldots, f_m, g_1, \ldots, g_m) \quad \text{and} \quad (x_{m+1}, \ldots, x_n, g_1, \ldots, g_m) \]

both form permutations of $A^n$. The program goes as follows:
• Step 1 ($m$ instructions). For $i$ from 1 to $m$, do $t_i \leftarrow g_i(x)$.

• Step 2 ($m$ instructions). For $i$ from 1 to $m$, do $y_i \leftarrow f_i(x)$. This is possible since $(x_{m+1}, \ldots, x_n, g_1, \ldots, g_m)$
form a permutation of $A^n$, and hence $f_i(x)$ can be expressed as a function of $(y_{m+1}, \ldots, y_n, t_1, \ldots, t_m)$.

• Step 3 ($m$ instructions). For $i$ from $m + 1$ to $n$, do $y_i \leftarrow f_i(x)$. This is possible since $(f_1, \ldots, f_m, g_1, \ldots, g_m)$
form a permutation of $A^n$, and hence $f_i(x)$ can be expressed as a function of $(y_1, \ldots, y_m, t_1, \ldots, t_m)$.

Now let $n = 2m + 1$ be odd. Then add one memory cell and consider the extended permutation
$g \in D(f, 1)$ such that $g_{2m+2}(x) = x_{2m+2}$. Then $g$ can be computed in $3m + 3$ instructions and $m + 1$ memory
cells. □

Therefore, we do not want more than around $n/2$ memory cells to compute any permutation; adding
any more would be superfluous. There is a linear analogue to Theorem 5.4.

Proposition 5.5. Any linear permutation of $A^n$ can be computed in at most $3m$ linear instructions
with $m$ memory cells if $n = 2m$ is even and at most $3m + 3$ linear instructions with $m + 2$ memory cells if
$n = 2m + 1$ is odd.

Proof. Suppose $n = 2m$. Let $f(x) = xM^\top$ and denote the first $m$ rows of $M$ as $M_1$ and the matrix
$J = (0 \mid I_m) \in A^{m \times n}$. We claim that there exists a matrix $N \in A^{m \times n}$ such that $(M_1^\top, N^\top)$ and $(J^\top, N^\top)$,
both in $A^{n \times n}$, are nonsingular. Then the algorithm simply places $N$ in the memory, then replaces the first
$m$ rows by $M_1$, and finally updates the last $m$ rows to those of $M$.

We now justify our claim. This is equivalent to showing that for any two subspaces in the Grassmannian
$G(q, 2m, m)$ of $m$-dimensional subspaces of $GF(q)^{2m}$, there exists a third subspace in the same Grassmannian
at subspace distance $2m$ from both [Til (where the subspace distance between $U, V \in G(q, 2m, m)$ is given
by $2\dim(U + V) - 2m$). Since the Grassmannian endowed with the subspace distance forms an association
scheme [TU], we only have to check for the row space of $J$ and one subspace at distance $2d$ for each $0 \leq d \leq m$.
Let us then assume $M_1 = (0_{m-d} \mid I_m \mid 0_d)$ whose row space is at subspace distance $2d$ from that of $J$. Then it
is easily checked that the row space of



N 



is at distance $2m$ from the row spaces of $M_1$ and $J$.

The case $n = 2m + 1$ is settled by considering $M' \in A^{(n+1) \times (n+1)}$ given by

\[ M' = \begin{pmatrix} M & 0 \\ 0 & 1 \end{pmatrix}. \]
□ 

For manipulations of variables, we can completely determine the gain offered by using memory. 

Example. Let $\pi = (1, 2)(3, 4) \in \mathrm{Sym}(4)$ and let $f^\pi : A^4 \to A^4$ be the corresponding permutation
of variables. By Corollary 4.8 two disjoint transpositions of variables must be computed in at least 6
instructions when no memory is used. However, adjoining one memory cell $t$ leads to a program with only
5 instructions, as seen below.

\[
\begin{aligned}
t &\leftarrow y_1 + y_3 && (= x_1 + x_3) \\
y_1 &\leftarrow y_2 && (= x_2) \\
y_2 &\leftarrow t - y_3 && (= x_1) \\
y_3 &\leftarrow y_4 && (= x_4) \\
y_4 &\leftarrow t - y_2 && (= x_3)
\end{aligned}
\]
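The five-instruction program of this example can be traced step by step. The sketch below assumes the alphabet is $\mathbb{Z}_q$ with arithmetic modulo $q$ and uses 0-based indices; the function name is hypothetical.

```python
def swap_pairs_with_memory(x, q):
    """Compute (x2, x1, x4, x3) from (x1, x2, x3, x4) with one memory cell t."""
    y = list(x)
    t = (y[0] + y[2]) % q      # t   <- y1 + y3   (= x1 + x3)
    y[0] = y[1]                # y1  <- y2        (= x2)
    y[1] = (t - y[2]) % q      # y2  <- t - y3    (= x1)
    y[2] = y[3]                # y3  <- y4        (= x4)
    y[3] = (t - y[1]) % q      # y4  <- t - y2    (= x3)
    return y
```

Each of $y_1, \ldots, y_4$ is updated exactly once, and the single memory cell accounts for the fifth instruction.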
More generally, we can show that using only one memory cell is sufficient to compute any manipulation 
of variables. 

Proposition 5.6. Any manipulation of n variables with F fixed points can be computed with one 
memory cell in at most n — F + 1 instructions. 

Proof. By Theorem 4.7 we only need to prove the case where $\phi$ is a permutation of $[n]$. Let $\pi$ be the
transformation of $[n + 1]$ defined as $i\pi = i\phi$ for all $i \in [n]$ and $(n + 1)\pi = 1$. Then by Theorem 4.7 we
can compute $f^\pi$ in $n - F + 2$ instructions, where the last instruction updates $y_{n+1}$. By removing that last
instruction, we compute $f^\phi$ in $n - F + 1$ instructions while using one memory cell $y_{n+1}$. □
By comparing with Theorem 4.7 we see that using only one memory cell reduces the length of the
program from $n - F + D$ to $n - F + 1$ for permutations. In particular, for a disjoint product of $m$ transpositions,
the complexity goes down from $3m$ to only $2m + 1$.

5.2. Binary instructions. Since the number of instructions is very large, one may want to use only a 
subset of instructions to compute any transformation. A natural choice is that of binary instructions, since 
any function can be computed as a composition of binary operations. 

Definition 5.7. An instruction $y_i \leftarrow g_i(y)$ is binary if $g_i$ only involves at most two variables: $g_i(y) =
g_i(y_j, y_k)$ for some $j, k \in [n]$.

Using binary instructions is not sufficient when computing without memory; however, it is sufficient 
when only one memory cell is used. 

Theorem 5.8. If $A = GF(2)$, then the set of all permutations of $A^n$ which can be computed using
binary instructions is the affine group $\mathrm{Aff}(n, 2)$. On the other hand, when using one memory cell, any
transformation over any alphabet can be computed by binary instructions.

Proof. Note that any binary permutation instruction is of the form $y_i \leftarrow g_i(y_i, y_j)$ for some $j \in [n]$. If
$A = GF(2)$ and $n = 2$, then it is well known that $\mathrm{Sym}(GF(2)^2) = \mathrm{Aff}(2, 2)$. If $n > 2$, then any instruction
of the form $y_i \leftarrow g(y_i, y_j)$ must correspond to a binary instruction for $GF(2)^2$ acting on the coordinates $y_i$,
$y_j$: it is also affine. Therefore, the group generated by binary permutation instructions is affine. Conversely,
extending Gaussian elimination to the affine case shows that any affine permutation can be computed via
binary instructions.

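If the memory cell $t$ is used, we claim that the instructions in Theorem 2.4 can be computed by binary
instructions. For the sake of simplicity, let us assume $i = 1$. For any $u \in A^n$ and $v = u + e^1$, we can
decompose

\[
\begin{aligned}
\delta(y, u) &= \delta(y_1, u_1)\delta(y_2, u_2) \cdots \delta(y_n, u_n), \\
\delta(y, u) - \delta(y, v) &= (\delta(y_1, u_1) - \delta(y_1, v_1))\delta(y_2, u_2) \cdots \delta(y_n, u_n).
\end{aligned}
\]

Then the transposition $(u, v)$ is computed as follows:

\[
\begin{aligned}
t &\leftarrow \delta(y_1, u_1) - \delta(y_1, v_1) \\
t &\leftarrow t\,\delta(y_2, u_2) \\
&\;\;\vdots \\
t &\leftarrow t\,\delta(y_n, u_n) \\
y_1 &\leftarrow y_1 + t,
\end{aligned}
\]

and the assignment $(e^0 \to e^1)$ is computed as:

\[
\begin{aligned}
t &\leftarrow \delta(y_1, 0) \\
t &\leftarrow t\,\delta(y_2, 0) \\
&\;\;\vdots \\
t &\leftarrow t\,\delta(y_n, 0) \\
y_1 &\leftarrow y_1 + t.
\end{aligned}
\]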
Since any transformation can be computed using these two types of instructions, it can be computed with 
binary instructions. □ 
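The transposition program in the proof of Theorem 5.8 can be checked by simulation. The sketch below assumes the alphabet is $\mathbb{Z}_q$ with $q \geq 2$ and arithmetic modulo $q$, and takes $v = u + e^1$ as in the proof; every instruction reads at most two of the cells $y_1, \ldots, y_n, t$. The function names are hypothetical.

```python
def delta(a, b):
    """Kronecker delta on single symbols."""
    return 1 if a == b else 0

def transpose_adjacent(y, u, q):
    """Swap the words u and v = u + e^1 using one memory cell t and
    binary instructions only (alphabet Z_q assumed, q >= 2)."""
    y = list(y)
    v1 = (u[0] + 1) % q
    t = (delta(y[0], u[0]) - delta(y[0], v1)) % q   # reads y_1 only
    for j in range(1, len(y)):
        t = t * delta(y[j], u[j]) % q               # reads t and y_j
    y[0] = (y[0] + t) % q                           # reads y_1 and t
    return tuple(y)
```

After the loop, $t$ equals 1 if $y = u$, $-1$ if $y = v$, and 0 otherwise, so the last instruction moves $u$ to $v$, moves $v$ to $u$, and leaves every other word unchanged.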